Abstract

I will discuss the recent proof that the complexity class NEXP (nondeterministic exponential time) 
lacks nonuniform ACC circuits of polynomial size. The proof will be described from the perspective of 
someone trying to discover it. 



1 Background 



Recently, a new type of circuit size lower bound was proved [Wil10, Wil11]. The proof is a formal
arrangement of many pieces. This article will not present the proof. It will not present a series of
technical lemmas, each lemma following from careful logical arguments involving the previous ones,
culminating in the final result. This article is a discussion about how to discover the proof - a casual
tour around it. Not all details will be given, but you will see where all the pieces came from, and how
they fit together. The path will be littered with my own biased intuitions about complexity theory - what
I think should and shouldn't be true, and why. Much of this intuition may well be wrong; however I can say
it has led me in a productive direction on at least one occasion. I hope this article will stimulate the
reader to think more about proving lower bounds in complexity.

The remainder of this section briefly recalls some (but not all of the) basics that we'll use. 

1.1 Uniform Complexity: Algorithms 

Recall that $\mathrm{NTIME}[t(n)]$ denotes the class of languages (decision problems) solvable by
nondeterministic algorithms running in time $t(n)$ on inputs of length $n$. So $L \in \mathrm{NTIME}[t(n)]$
provided that there is a nondeterministic algorithm such that for all strings $x \in L$, there is a
computation path of $t(|x|)$ steps that results in acceptance of $x$, and for $x \notin L$, every possible
computation path of $t(|x|)$ steps results in rejection of $x$.
We define $\mathrm{NEXP} = \bigcup_{k>0} \mathrm{NTIME}[2^{n^k}]$. NEXP informally corresponds to problems
with exponentially long solutions which are verifiable in exponential time. This class encompasses
everything considered feasibly computable (and more): NP, PSPACE, EXP, etc.

The nondeterministic time hierarchy [SFM78, Zak83] says that, as one permits longer solutions to problems
and longer verification time for those solutions, one can always find strictly more problems with
verifiable solutions, under the constraints. In notation,
$\mathrm{NTIME}[t(n)] \subsetneq \mathrm{NTIME}[T(n)]$ for $t(n+1) \le o(T(n))$.
One consequence of the nondeterministic time hierarchy is that $\mathrm{NEXP} \ne \mathrm{NP}$.¹



*An earlier version of this article appears in SIGACT News, September 2011.

†Computer Science Department, Stanford University, Stanford, CA, USA. rrwilliams@gmail.com. At the time
of writing, the author was supported by the Josef Raviv Memorial Fellowship at IBM Research - Almaden.

¹Actually, the usual deterministic time hierarchy suffices to prove this, using a padding/translation argument.

1.2 Nonuniform Complexity: Circuits



With the usual uniform models of computation (Turing machines, lambda calculus, μ-recursive functions,
etc.), a function only counts as computable provided we can find a single algorithm in the model that com- 
putes the function on all possible finite inputs. Similarly, in complexity theory, a function is only efficiently 
computable if a single algorithm runs efficiently on all finite inputs. 

Suppose I allow you to run a different algorithm $A_n$ for every distinct input length $n$. This amounts to
having a countably infinite set of algorithms, which looks unrealistic.² But by permitting the length of a
program to grow with the input length, we can more accurately model algorithms in practice that exploit
the fact that there is an upper bound on the inputs they receive. Could there be a program for 3SAT with a
billion lines of code that rapidly solves 3SAT on all formulas with fewer than a billion clauses? Nonuniform
complexity can address questions of this form.

We will imagine an infinite family of algorithms $\{A_n\}$ as a family of logical circuits, where $A_n$
takes $n$ bits of input and returns a bit. (We'll work exclusively over the binary alphabet, for
simplicity.) The classes of circuits considered in this article are, in increasing order of expressiveness:

• $\mathrm{AC}^0$, the class of circuits with constant depth and polynomial size, having unbounded fan-in
AND, OR, and NOT gates,

• $\mathrm{AC}^0[m]$, the class of circuits with constant depth and polynomial size, having unbounded fan-in
$\mathrm{MOD}_m$, AND, OR, and NOT gates (where a $\mathrm{MOD}_m$ gate outputs 1 iff the sum of its inputs
is divisible by $m$),

• ACC, the union over all $m$ of the classes $\mathrm{AC}^0[m]$,³ and

• P/poly, the class consisting of arbitrary polynomial size Boolean circuits with bounded fan-in AND
and OR gates, and NOT gates.

(One can take the "size" of a circuit to be either the number of gates or the number of wires; for us, the
choice won't matter.) We will identify the circuit classes above with their corresponding language classes,
which consist of all decision problems solvable with an infinite family of such circuits. So,
"$\mathrm{NP} \subset \mathrm{P/poly}$" states that every problem in NP can be solved with an infinite
circuit family $\{C_n\}$ drawn from P/poly, where each $C_n$ is run on inputs of length $n$.

A routine fact is that $\mathrm{P} \subset \mathrm{P/poly}$: problems solvable in polynomial time by some
algorithm can be solved by an infinite family of polynomial size circuits. Therefore, our restriction to
considering circuits rather than arbitrary "growing" programs is actually without loss of generality: a
poly($n$)-time program with poly($n$) lines of code can be simulated by a poly($n$)-size circuit on all
$n$-bit inputs, although the underlying polynomials may not have the same degrees.⁴

²Indeed, one can solve undecidable problems using an infinite number of algorithms: let $A_n(1^n)$ output 1
iff the $n$th Turing machine halts on blank tape. This is a fact that is either true or false for a given
$n$, so for each $n$ we can make a very short efficient program $A_n$ that captures this behavior.

³Note that an ACC circuit family doesn't necessarily have to be polynomial sized. By default, a circuit's
size should be assumed polynomial unless otherwise specified.

⁴We use the notation poly($n$) to denote expressions of the form $O(n^k)$ for some fixed $k$ independent of $n$.

1.3 Ruling Out Polynomial Size Circuits for Uniform Complexity Classes

What uniform computations can be simulated in P/poly? This is largely open. Randomized complexity
classes like RP and BPP are in P/poly, but we do not believe that NP-complete problems can be solved
with polynomial size circuit families. However, proving $\mathrm{NP} \not\subset \mathrm{P/poly}$ is only
stronger than $\mathrm{P} \ne \mathrm{NP}$. The "smallest" uniform complexity class that we know is not
contained in P/poly is $\mathrm{MA}_{\mathrm{EXP}}$, the exponential-time version of Merlin-Arthur games
[BFT98]. Kabanets and Impagliazzo [KI04] "almost" proved that the slightly smaller class
$\mathrm{NEXP}^{\mathrm{RP}}$ (nondeterministic exponential time with an RP oracle) isn't in P/poly: either
$\mathrm{NEXP}^{\mathrm{RP}}$ doesn't have arithmetic polysize circuits, or it doesn't have (the usual
Boolean) polysize circuits. Both $\mathrm{MA}_{\mathrm{EXP}}$ and $\mathrm{NEXP}^{\mathrm{RP}}$ are enormous
classes, containing NEXP and more.

So while we can show that certain functions cannot have polynomial size circuits, those functions are
extremely difficult for uniform algorithms to compute. But it could still be true that
$\mathrm{EXP}^{\mathrm{NP}} \subset \mathrm{P/poly}$. This looks crazy; if true, it would not only mean that
every problem with exponentially long solutions can be solved with polynomial size circuits, but that every
problem in $\mathrm{EXP}^{\mathrm{NP}}$ has "highly compressible" solutions, representable with polynomial
size circuits! Moreover, we will see in Section 5 that a similar result holds assuming
$\mathrm{NEXP} \subset \mathrm{P/poly}$.



1.4 ACC: The Frontier 

Can we make progress on P/poly lower bounds, by considering more restricted classes of circuits? In the
space of this article we can only give a condensed history; much more can be found in the surveys
[All96, Vio09]. Furst, Saxe, and Sipser [FSS81] and Ajtai [Ajt83] proved that simple functions such as the
parity of $n$ bits cannot be computed by polynomial size $\mathrm{AC}^0$ circuits. (These results were later
strengthened to exponential size [Yao85, Has86].) A natural next step was to grant $\mathrm{AC}^0$ the
parity function for free - resulting in the study of $\mathrm{AC}^0[2]$. Razborov [Raz87] proved an
exponential size lower bound for computing the majority of $n$ bits in $\mathrm{AC}^0[2]$. Smolensky [Smo87]
proved exponential lower bounds for computing $\mathrm{MOD}_q$ with $\mathrm{AC}^0[p]$, for distinct primes
$p$ and $q$. Barrington [Bar89] suggested the next step would be to prove lower bounds for the class ACC,
which allows for $\mathrm{MOD}_m$ gates where $m$ can be an arbitrary constant.

Although it was conjectured over 20 years ago that the majority of $n$ bits cannot be computed with
ACC, strong ACC lower bounds have escaped proof. Suppose we grant $\mathrm{AC}^0$ both the $\mathrm{MOD}_2$
and the $\mathrm{MOD}_3$ function for free; this is equivalent to studying $\mathrm{AC}^0[6]$, as seen by
the equations

$\mathrm{MOD}_6(x_1,\ldots,x_n) = \mathrm{MOD}_3(x_1,\ldots,x_n) \wedge \mathrm{MOD}_2(x_1,\ldots,x_n)$,
$\mathrm{MOD}_2(x_1,\ldots,x_n) = \mathrm{MOD}_6(x_1,x_1,x_1,\ldots,x_n,x_n,x_n)$,
$\mathrm{MOD}_3(x_1,\ldots,x_n) = \mathrm{MOD}_6(x_1,x_1,\ldots,x_n,x_n)$.

Even for this class, it was still possible that $\mathrm{EXP}^{\mathrm{NP}} \subset \mathrm{AC}^0[6]$! Given
that $\mathrm{AC}^0[p]$ was known to be very weak for every prime $p$, this was an extremely frustrating open
problem - how could $\mathrm{MOD}_6$ be so much more powerful than $\mathrm{MOD}_7$?
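
The three identities above are easy to confirm by brute force for small $n$. The following is a minimal Python sketch, not part of the original argument; the function names `mod_gate` and `check_identities` are illustrative choices, and the gate follows the convention stated earlier (output 1 iff the sum of the inputs is divisible by $m$).

```python
from itertools import product


def mod_gate(m, bits):
    """MOD_m gate: outputs 1 iff the sum of its inputs is divisible by m."""
    return 1 if sum(bits) % m == 0 else 0


def check_identities(n):
    """Check the three MOD_2 / MOD_3 / MOD_6 identities on all 2^n inputs."""
    for x in product((0, 1), repeat=n):
        tripled = [b for b in x for _ in range(3)]  # x1,x1,x1,...,xn,xn,xn
        doubled = [b for b in x for _ in range(2)]  # x1,x1,...,xn,xn
        if mod_gate(6, x) != (mod_gate(3, x) & mod_gate(2, x)):
            return False
        if mod_gate(2, x) != mod_gate(6, tripled):
            return False
        if mod_gate(3, x) != mod_gate(6, doubled):
            return False
    return True
```

Repeating each input three times multiplies the sum by 3, and a multiple of 3 is divisible by 6 exactly when the original sum is even; the doubled case is symmetric.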

The recent paper [Wil11] finally rules out this ludicrous possibility. For example, we can prove that
$\mathrm{AC}^0[6]$ circuits for $\mathrm{EXP}^{\mathrm{NP}}$ must necessarily have at least $2^{n^{\delta}}$
size, for some $\delta > 0$ that depends on the circuit depth. The proof extends to the (smaller) class NEXP
as well, although there is some loss in the size lower bound. Nevertheless we can still rule out polynomial
size ACC circuits for NEXP (even quasipolynomial size). The basic framework behind the proof is generic
enough that it is reasonable to believe it can be extended to prove much stronger results: perhaps
$\mathrm{NEXP} \not\subset \mathrm{P/poly}$, or $\mathrm{NP} \not\subset \mathrm{ACC}$, or more.

2 Acquiring The Target



Suppose we've set ourselves to finding a problem in NEXP that cannot be in ACC. What is a good NEXP 
problem to choose? The "hardest" possible candidates should be NEXP-complete ones - if there's a problem 
in NEXP \ ACC, then the complete ones are there! However, NEXP-completeness hasn't been studied nearly
as much as NP-completeness, so the list of NEXP-complete problems doesn't appear to be terribly long. 
Nevertheless there is a natural way to construct NEXP problems out of NP problems, by focusing on the 
highly compressible instances of NP-complete problems. 

Given a problem $\Pi$, we define the SUCCINCT $\Pi$ problem as follows. Let $\mathcal{C}$ be the set of all
Boolean circuits with a single output gate over the gate basis AND/OR/NOT. For every $C \in \mathcal{C}$,
let $T(C)$ be the truth table of the function represented by $C$. More formally, letting $n$ be the number
of inputs to $C$, $T(C)$ is the $2^n$-bit string where $T(C)[i] = C(s_i)$, where $s_i$ is the $i$th $n$-bit
string (in lexicographical order, say).

Problem: SUCCINCT $\Pi$
Given: A circuit $C$ from $\mathcal{C}$ with $n$ inputs and poly($n$) size.
Task: Determine whether $T(C)$ is a yes-instance of $\Pi$, i.e., $T(C) \in \Pi$.

So in SUCCINCT $\Pi$, we only wish to solve the "highly compressible" instances of $\Pi$: those $2^n$-bit
instances which are compressible to poly($n$)-bit representations as circuits.
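
To make the definition concrete, here is a minimal Python sketch (not from the original article) that expands a small circuit into its truth table $T(C)$ and hands the table to a decision procedure for $\Pi$. The gate-list encoding and the names `eval_circuit`, `truth_table`, `succinct_pi`, and `XOR_GATES` are assumptions made for illustration; any reasonable circuit encoding would serve.

```python
from itertools import product


def eval_circuit(gates, x):
    """Evaluate a circuit on the input bits x.

    Wires 0..n-1 carry the inputs; each gate appends one new wire.
    A gate is ("AND", a, b), ("OR", a, b) or ("NOT", a), where a and b
    index earlier wires. The last wire is the output.
    """
    wires = list(x)
    for gate in gates:
        op = gate[0]
        if op == "AND":
            wires.append(wires[gate[1]] & wires[gate[2]])
        elif op == "OR":
            wires.append(wires[gate[1]] | wires[gate[2]])
        elif op == "NOT":
            wires.append(1 - wires[gate[1]])
        else:
            raise ValueError(f"unknown gate type: {op}")
    return wires[-1]


def truth_table(n, gates):
    """T(C): the 2^n-bit table, with inputs listed in lexicographical order."""
    return [eval_circuit(gates, x) for x in product((0, 1), repeat=n)]


def succinct_pi(n, gates, decide_pi):
    """Solve SUCCINCT Pi naively: expand the circuit, then decide Pi on T(C)."""
    return decide_pi(truth_table(n, gates))


# Hypothetical example: XOR of two inputs, built from AND/OR/NOT.
XOR_GATES = [("NOT", 0), ("NOT", 1), ("AND", 0, 3), ("AND", 2, 1), ("OR", 4, 5)]
```

The expansion step alone costs $2^n$ circuit evaluations, which is why the interesting question is whether the circuit's structure allows anything faster.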

The definition may look odd at first, but studying succinct problems is something that many of us already
do. Consider the OR problem: given a bit string $x$, does $x$ contain a 1? This problem is trivial from the
time complexity perspective, but still interesting on the circuit complexity level, as it is not known
whether constant-depth circuits made entirely of $\mathrm{MOD}_6$ gates can compute OR efficiently [HK09].
However, the SUCCINCT OR problem is exactly the NP-complete Circuit Satisfiability problem: given a circuit,
does its truth table contain a 1? So even the succinct versions of trivial problems are already interesting.⁵
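
The correspondence between SUCCINCT OR and Circuit Satisfiability can be seen in a few lines. This is only an illustrative sketch: the circuit is modeled as an arbitrary Python callable, and the names `succinct_or`, `and_of_all`, and `contradiction` are hypothetical.

```python
from itertools import product


def succinct_or(circuit, n):
    """Return True iff the truth table of `circuit` on n input bits contains a 1.

    `circuit` is any callable from a tuple of n bits to 0 or 1, standing in
    for a Boolean circuit. The loop may visit all 2**n inputs, so this is
    just the brute-force algorithm for Circuit Satisfiability.
    """
    return any(circuit(x) == 1 for x in product((0, 1), repeat=n))


def and_of_all(x):
    """Example circuit: satisfied only by the all-ones input."""
    return int(all(x))


def contradiction(x):
    """Example circuit: x[0] AND (NOT x[0]), which is never satisfied."""
    return x[0] & (1 - x[0])
```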

What if $\Pi$ is an NP-complete problem? How hard is SUCCINCT $\Pi$? There are nondeterministic
exponential time algorithms for solving such problems:

Proposition 1 Let $\Pi$ be an NP-complete problem that admits proofs of length $\ell(n)$ for $n$-bit
instances, with a verifier that runs in $t(\ell)$ time on proofs of length $\ell$. SUCCINCT $\Pi$ can be
solved in $t(\ell(2^n)) + 2^n \cdot \mathrm{poly}(s)$ nondeterministic time, on circuits of size $s$ with
$n$ inputs.

Proof. Evaluate the given circuit on all of its possible inputs in $2^n \cdot \mathrm{poly}(s)$ time,
producing an instance of $\Pi$ of length $2^n$. By assumption, the instance has a proof (if one exists) of
length $\ell(2^n)$, and the proof can be verified in $t(\ell(2^n))$ time. Nondeterministically guessing the
$\ell(2^n)$-bit proof and verifying that proof yields the running time. □

As an example, consider the succinct version of 3SAT:

Corollary 2.1 SUCCINCT 3SAT can be solved in nondeterministic
$2^n \cdot (\mathrm{poly}(s) + \mathrm{poly}(n))$ time on circuits of $n$ inputs and $s$ size.

⁵Encyclopedias could be written on succinct representations of problems in computer science. In complexity
theory, succinctly represented problems are closely related to the structural notion of sparse sets; this
is best illustrated by Hartmanis, Immerman, and Sewelson's theorem that
$\mathrm{TIME}[2^{O(n)}] \ne \mathrm{NTIME}[2^{O(n)}]$ iff there is a sparse set in NP \ P [HIS85]. Implicit
representations of graphs have been widely studied, and solving problems on them amounts to solving the
succinct version of a graph problem. In other communities, BDDs (Binary Decision Diagrams) are the standard
means for representing functions; many problems studied in that arena can be seen as succinct problems
where the underlying circuit class $\mathcal{C}$ has been replaced with the set of BDDs. The Wikipedia
articles on these (particular) topics are good starting points for further references.

Proof. Apply the above proposition. Here the proofs are satisfying assignments, which do not exceed
the length of a formula, so $\ell(n) \le n$. Verifying a satisfying assignment for a $2^n$-size 3-CNF can
be done in $O(2^n \cdot \mathrm{poly}(n))$ time. □
Papadimitriou and Yannakakis [PY86] showed that for all known NP-complete problems $\Pi$, SUCCINCT $\Pi$
is NEXP-complete. We state their result informally:

Theorem 2.1 ([PY86]) If $\Pi$ is NP-complete under "ultra-efficient reductions" then SUCCINCT $\Pi$ is
NEXP-complete.

Essentially what is needed in an "ultra-efficient reduction" is that each bit of the reduction's output can 
be computed from a polylogarithmic number of bits of the input, in polylogarithmic time. Now we have our 
pick of candidate NEXP problems: the succinct versions of NP-complete problems are fair game. 

What NP-complete problem could be more natural than 3SAT? It has been studied to death; the literature
is filled with theorems on it. An attractive property of SUCCINCT 3SAT is that it's very NEXP-complete: 
there are super-ultra-efficient reductions from arbitrary languages in NEXP to SUCCINCT 3SAT instances. 
So there is little loss of generality in focusing on SUCCINCT 3SAT. 

Theorem 2.2 (Efficient Cook-Levin for NEXP) SUCCINCT 3SAT is NEXP-complete under polynomial
time reductions. Moreover, there is a polynomial time reduction $R$ from arbitrary
$L \in \mathrm{NTIME}[2^n]$ to SUCCINCT 3SAT with the properties:

• $x \in L \iff R(x) \in$ SUCCINCT 3SAT; i.e., $R(x)$ is a circuit such that $T(R(x))$ encodes a
satisfiable 3-CNF formula.

• $R(x)$ is a circuit with poly($|x|$) gates.

• For all sufficiently long $x$, the number of inputs to the circuit $R(x)$ is at most $|x| + 4\log|x|$.

The first two properties could be met rather straightforwardly, if the circuit $R(x)$ were allowed to have
up to $O(|x|)$ inputs. One of the many textbook proofs that 3SAT is NP-complete would suffice. We can
convert a nondeterministic time $t$ computation $A(x)$ into a $t^{c+1}$ size 3-CNF formula, by first
translating the computation of $A(x)$ into a nondeterministic one-tape Turing machine $M(x)$ running in time
$t^c$ and using space $t$, then building a $t^c \times t$ matrix $T_x$ where $T_x(i,j)$ holds the content of
the $j$th cell of $M(x)$ at step $i$ of its execution. (And if the head is reading cell $j$ at step $i$,
then $T_x(i,j)$ also holds the state of $M(x)$ at step $i$.) Note $T_x$ is often called a tableau. (The
particular value of $c$ depends on the original computational model: if the model is multitape Turing
machines, then $c = 2$ suffices.)

Observing that every entry in $T_x$ can be determined from at most three other entries, we can generate
constant size 3-CNF formulas, one for each entry of $T_x$, such that their conjunction is satisfiable if and
only if $A(x)$ accepts. The 3-CNF formula generated is extremely regular, in that essentially the same group
of clauses is produced repeatedly (with only minor changes in the variable indices). It follows that the
clauses corresponding to entry $T_x(i,j)$ can be efficiently produced with a poly($\log t$)-size circuit
that is given $(i,j) \in [t^c] \times [t]$ as a $((c+1)\log t + O(1))$-bit string. When $t = 2^n$, we obtain
a poly($n$)-size circuit with about $(c+1)n$ inputs.

However, more efficient proofs of the Cook-Levin theorem exist, and the formulas obtained there have
high redundancy too. Even for random access machines, there is a reduction from time-$t$ computation to
$O(t \log^4 t)$ size formulas where the $i$th bit of the formula can be computed (given the integer $i$ as
an input) in poly($\log t$) time [Coo88, Rob91, FLvMV05]. This corresponds to a reduction in the number of
inputs to the circuit $R(x)$, from $(c+1)|x|$ down to $|x| + 4\log|x|$. There are several ways to achieve
this kind of reduction, but unfortunately we do not have the space to include intuition for them; please
consult the references above.

We saw earlier that SUCCINCT 3SAT can be solved nondeterministically in $2^n \cdot s^{O(1)}$ time, on
circuits with $s$ gates and $n$ inputs. Theorem 2.2 implies a time lower bound on how efficiently SUCCINCT
3SAT can be solved nondeterministically.

Theorem 2.3 (Time Lower Bound for SUCCINCT 3SAT) SUCCINCT 3SAT cannot be solved in
$2^{n - \omega(\log n)}$ time (even with nondeterminism) on circuits with $n$ inputs and poly($n$) gates.

Proof. Assume SUCCINCT 3SAT had a nondeterministic algorithm with the above running time. By the
Cook-Levin Theorem for NEXP (Theorem 2.2), every $n$-bit instance of every $L \in \mathrm{NTIME}[2^n]$ can
be reduced in poly($n$) time to a SUCCINCT 3SAT circuit $C$ with $n + 4\log n$ inputs and poly($n$) size. By
assumption, the "succinct satisfiability" of $C$ can be determined in
$2^{(n + 4\log n) - \omega(\log n)} \cdot \mathrm{poly}(n) \le o(2^n)$ time, with a nondeterministic
algorithm. Therefore every $L \in \mathrm{NTIME}[2^n]$ is contained in the class $\mathrm{NTIME}[o(2^n)]$,
i.e., $\mathrm{NTIME}[2^n] \subseteq \mathrm{NTIME}[o(2^n)]$. But this contradicts the nondeterministic time
hierarchy theorem [SFM78, Zak83], which says $\mathrm{NTIME}[o(2^n)] \subsetneq \mathrm{NTIME}[2^n]$. □
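
The running-time estimate in the proof can be unpacked as follows. This is a worked sketch under two notational assumptions that are not in the original: the $\omega(\log n)$ term is written as $g(n)\log n$ with $g(n) \to \infty$, and the poly($n$) factor is bounded by $n^d$ for a fixed $d$. Logarithms are base 2.

```latex
\[
2^{(n + 4\log n) - g(n)\log n} \cdot n^{d}
  \;=\; 2^{n} \cdot n^{4} \cdot n^{-g(n)} \cdot n^{d}
  \;=\; 2^{n} \cdot n^{\,4 + d - g(n)}
  \;=\; o(2^{n}),
\]
```

since $4 + d - g(n) \to -\infty$ as $n$ grows. The extra $4\log n$ input bits from the reduction and the polynomial overhead are both absorbed by the $\omega(\log n)$ savings in the exponent, which is why the number of inputs in Theorem 2.2 matters so much.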

So there is a concrete limitation on how efficiently SUCCINCT 3SAT can be solved, and it looks pretty
strong. Could this result on the time complexity of SUCCINCT 3SAT be translated into a limitation on the
circuit complexity? Let's think back to why we believe that separations like
$\mathrm{NEXP} \not\subset \mathrm{ACC}$ are true. We believe that problems in nondeterministic exponential
time cannot be solved with polynomial size circuits, simply because exponentials grow much faster than
polynomials. This is the main reason why we can diagonalize and prove $\mathrm{NEXP} \ne \mathrm{NP}$, but
this observation is not at all enough to prove a nonuniform lower bound against NEXP. We have to show that
even if one were allowed infinite time to rig up infinitely many polynomial size circuits, each devoted to a
separate input length $n$, one still cannot solve SUCCINCT 3SAT with this model. The diagonalization
argument used in the proof of $\mathrm{NEXP} \ne \mathrm{NP}$ won't work here, and this is "provably" true.
(More formally, the diagonalization argument works relative to every oracle, but there are oracles relative
to which $\mathrm{NEXP} \subset \mathrm{P/poly}$.)

Still, it is hard to let go of a strong feeling that polynomial size circuits simply contain too little
information to carry out a full simulation of an exponential time computation. Although you are given a
separate circuit for each input length, that little circuit is completely representative of some
time-intensive function's behavior on an exponential number of inputs. A polynomial size circuit for a
function means that the function's truth table is highly compressible and regular. In that sense, polynomial
size circuits seem much closer to polynomial time algorithms than to exponential time algorithms. We'd like
to say that, if there were small circuit families for a problem like SUCCINCT 3SAT, then there may as well
be time efficient algorithms for SUCCINCT 3SAT. That is, if SUCCINCT 3SAT had polynomial size circuits, then
these "short representatives" of exponential time computation may be discovered algorithmically in an
efficient way. More generally, the mere existence of these short representatives should mean that SUCCINCT
3SAT has so much problem structure that this structure can also be exploited algorithmically.

At this point we have reached a degree of handwaving so exuberant, one may fear we are about to fly 
away. Surprisingly, this handwaving has a completely formal theorem behind it: 

Theorem 2.4 (Spinning Circuits Into Algorithms [Wil11]) If SUCCINCT 3SAT can be solved with polynomial
size ACC circuits, there is an $\varepsilon > 0$ such that SUCCINCT 3SAT can be solved by a nondeterministic
algorithm running in $O(2^{n - n^{\varepsilon}})$ time, on all circuits with $n$ inputs and poly($n$) size.

The contrapositive says that time lower bounds can be spun into circuit lower bounds. From Theorem 2.4
it follows readily that SUCCINCT 3SAT cannot have polynomial size ACC circuits, since the consequence
of Theorem 2.4 contradicts Theorem 2.3, the time lower bound for SUCCINCT 3SAT.

Corollary 2.2 SUCCINCT 3SAT does not have polynomial size ACC circuits, i.e., $\mathrm{NEXP} \not\subset \mathrm{ACC}$.

So Theorem 2.4 is now our primary target. Why might it be true? How can we spin nonuniform circuits
for SUCCINCT 3SAT into a single uniform algorithm which beats $2^n$ time? Since we can allow nondeterminism
in the algorithm, we could guess a polynomial size ACC circuit $C$ that solves SUCCINCT 3SAT, then run $C$
on our input. But how could we check that $C$ correctly solves SUCCINCT 3SAT? Naively, we would need to
check that on all $2^n$ inputs $x$, $C(x) = 1$ iff $T(x)$ is an exponentially long satisfiable 3-CNF
formula. The time lower bound (Theorem 2.3) suggests this is impossible to do in less-than-$2^n$ time.



3 Program Checking? 

We may try to draw ideas from program checking, a topic introduced by Blum and Kannan [BK95]. In program
checking, one has a desired problem $\Pi$ in mind, and one is given a program $P$ as a black box along with
an input $x$. One wishes to efficiently determine if the output of $P(x)$ equals $\Pi(x)$, i.e. if $P$
reports a correct answer on $x$, by asking questions to $P$. More formally:

Definition 3.1 A program checker $C$ for a problem $\Pi$ and input $x$ is a probabilistic polynomial time
algorithm which is given black-box access to a program $P$ and has the following properties for every $P$
and $x$:

• If $P$ correctly computes $\Pi$ on all inputs, then $C^P(x)$ outputs the correct answer, with high probability.⁶

• If $P(x) \ne \Pi(x)$ then $C^P(x)$ outputs "fail" or the correct answer, with high probability.

What problems $\Pi$ can be checked in this way? There has been extensive work on this question; cf. the
work of [GGHKR08] for a survey and recent results. SUCCINCT 3SAT doesn't seem to have a program
checker, but there is another way in which SUCCINCT 3SAT can be efficiently checked. In a very influential
paper, Babai, Fortnow, and Lund [BFL91] proved that every NEXP problem $\Pi$ can be recognized by a
probabilistic polynomial time (PPT) algorithm with access to an arbitrary oracle which is trying to "prove"
that a given instance is in $\Pi$. More precisely, for every NEXP problem $\Pi$ there is a PPT algorithm $A$
such that

• if $x \in \Pi$ then there is an oracle $O$ such that $\Pr[A^O(x) \text{ accepts}] > 2/3$, and

• if $x \notin \Pi$ then for all oracles $O$, $\Pr[A^O(x) \text{ rejects}] > 2/3$.

Informally, every $\Pi \in \mathrm{NEXP}$ has some PPT verifier $A$ with exponentially long proofs that can
be efficiently checked. Since the oracle $O$ could only be asked $\exp(|x|^c)$ different queries over all
possible runs of $A^O(x)$, it follows that for every $x$ of length $n$, the corresponding oracle $O$ can be
represented by an $\exp(n^c)$-bit string encoding all the possible queries and answers of $A^O(x)$. Hence
this proof verification model characterizes NEXP. This is encapsulated by the equation NEXP = MIP. (The
class MIP stands for Multiple Interactive Provers, an equivalent model to the PPT algorithm with oracle
access.)

⁶The notation $C^P$ denotes $C$ with black-box access (i.e., oracle access) to $P$.

Let's see what happens when we try to apply the NEXP = MIP theorem directly to our situation. Recall
we want to derive a SUCCINCT 3SAT algorithm that is nondeterministic and runs in less than $2^n$ time.
Consider the algorithm:

SatAlg($x$):

    Nondeterministically guess a poly($|x|$)-size circuit $C$.

    Run a PPT algorithm $A$ for checking SUCCINCT 3SAT on $x$,
    treating $C$ as the oracle.

    If $A^C(x)$ accepts then accept else reject.

Unfortunately, SatAlg is not a correct nondeterministic algorithm: it takes a nondeterministic guess
followed by a randomized computation $A$ which could err when it rejects. So SatAlg($x$) could have an
accepting computation path, when $x$ is in fact a no-instance of SUCCINCT 3SAT. Moreover, it is not known
how to convert arbitrary (two-sided error) PPT algorithms into efficient nondeterministic ones. (Indeed, it
is open whether BPP = NEXP; that is, probabilistic polynomial time could be as strong as nondeterministic
exponential time!) Could we possibly remove the use of randomness in the checker? The NEXP = MIP result no
longer holds when you replace PPT algorithms with nondeterministic algorithms: a nondeterministic
polynomial time algorithm that consults an oracle could be only as powerful as NP itself.⁷

We seem to have failed to progress towards Theorem 2.4, but here's a thought. Program checking has an
inherently black-box aspect: we only study the input/output behavior of a program (or proof oracle, in the
case of NEXP = MIP). But our particular box of interest (SUCCINCT 3SAT) is very special; by assumption,
it can be modeled with a small ACC circuit. In a nondeterministic algorithm, we could guess this circuit and
dissect its insides. Surely this extra information is useful.

4 Black Boxes Versus Circuits 

What distinguishes a black box from a small circuit? If we could analyze circuits in a way which is provably 
better than analyzing black boxes, perhaps we could improve on what boxes can offer in the above. We can 
approach this improvement from two directions: try to find "easy" circuit-analysis problems, or try to find 
"hard" black-box-analysis problems. Or we could try both. 

Consider a black box that takes $n$ inputs and prints a bit; we can query it repeatedly, and we have
to determine some property of it. What is the hardest simple black-box problem? I would say that it is
determining if the box will output a 1 on some input. If I want to determine this, an annoying adversary
could simply answer "0" to all my queries until the last one. So this simple problem already requires $2^n$
queries to solve - a hard black-box problem.
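
The adversary argument can be played out in code. The sketch below is illustrative only: `AdversarialBox` and `has_one` are hypothetical names, and the box simply commits to "0" on every new query until a single unqueried input remains, at which point it is free to answer either way.

```python
from itertools import product


class AdversarialBox:
    """Black box on n-bit inputs that answers 0 for as long as it can.

    Only when a single unqueried input remains does it reveal
    `final_answer`, so no earlier query settles whether a 1 exists.
    """

    def __init__(self, n, final_answer):
        self.n = n
        self.final_answer = final_answer
        self.answers = {}  # inputs seen so far, with the committed answers

    def query(self, x):
        if x not in self.answers:
            remaining = 2 ** self.n - len(self.answers)
            self.answers[x] = self.final_answer if remaining == 1 else 0
        return self.answers[x]


def has_one(box, n):
    """Decide whether the box outputs 1 on some input; also count the queries."""
    queries = 0
    for x in product((0, 1), repeat=n):
        queries += 1
        if box.query(x) == 1:
            return True, queries
    return False, queries
```

Whatever order the inputs are tried in, this box forces all $2^n$ queries before the answer is determined; the lexicographic order used by `has_one` is just one choice.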

Can we solve the problem more efficiently if we put circuits in place of black boxes? Replacing the 
black box with a circuit is precisely the Circuit Satisfiability problem. And if one were to define "more 
efficiently" to be "polynomial time" then this is the P versus NP question. We should tread lightly in this 
area of the jungle. Before proceeding further, let's do a sanity check on our line of thought, and assume the 
strongest "separation" between black-box hardness and circuit hardness. Let's assume Circuit SAT can be 
solved really efficiently, P = NP. Could we then prove our desired circuit lower bound? Yes. 

⁷One can simulate a nondeterministic algorithm-with-oracle in NP, by simply guessing an accepting
computation path for the algorithm (along with prospective answers for the oracle queries along the way),
then checking that the oracle answers are consistent with each other. If there is no oracle that makes the
nondeterministic algorithm accept, then no accepting path can exist. However, if we considered
co-nondeterministic algorithms with oracles, then we recover NEXP again. This point will be revisited in
Section 5.

Theorem 4.1 (Karp-Lipton [KL80], attributed to Meyer) Suppose Circuit Satisfiability is in P. If SUCCINCT
3SAT were solvable with polynomial size circuits, then all problems solvable in $2^n$ time would be
solvable in polynomial time (therefore, SUCCINCT 3SAT does not have polynomial size circuits, by the time
hierarchy theorem).

In fact, Meyer proves a stronger implication: if P = NP then $\mathrm{EXP} \not\subset \mathrm{P/poly}$.
That is, assuming Circuit Satisfiability is in P, there is an exponential time computable function that
doesn't have polynomial size circuits.

The idea of the proof is to set up a fast (contradictory) simulation of every $2^n$ time algorithm $A$,
assuming both P = NP and $\mathrm{EXP} \subset \mathrm{P/poly}$. For simplicity let us assume $A$ is a
one-tape Turing machine; the argument can be generalized for other models. On an input $x$,
nondeterministically guess a polynomial size circuit $C$ that encodes the (exponentially long) computation
history of $A$; that is, the truth table $T(C)$ of $C$ is a valid computation history of $A(x)$. Such a $C$
exists if $\mathrm{EXP} \subset \mathrm{P/poly}$. To verify $C$ works correctly, we can universally (using
co-nondeterminism) try all steps $i$ and all tape cells $j$, and verify that $C$ makes consistent claims
about the content of cell $j$ at step $i$, by comparing the claimed content at step $i-1$ of cells $j-1$,
$j$, and $j+1$. (As $A$ is a one-tape machine, the content of cell $j$ can only be affected by that of
$j-1$, $j$, and $j+1$ in the previous step.) This only requires evaluating $C$ at four different pairs of
indices: $(i,j)$, $(i-1,j-1)$, $(i-1,j)$, and $(i-1,j+1)$, which can be done in polynomial time. If $C$
makes consistent claims about $(i,j)$ for every $i$ and $j$, then our simulation accepts iff $C$ claims that
$A(x)$ accepts. This is a $\Sigma_2\mathrm{P}$ computation, where we start with a nondeterministic guess and
then universally verify our guess. But if P = NP then $\Sigma_2\mathrm{P} = \mathrm{P}$, so we have
simulated every $2^n$ time algorithm $A$ in polynomial time, a contradiction to the time hierarchy theorem.
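
The local verification step can be illustrated on a toy model. In the sketch below, the one-tape Turing machine is replaced by a hypothetical stand-in: a local update `rule` (elementary cellular automaton 110) in which each cell depends only on the three cells around it in the previous row. The guessed circuit is modeled as a callable `claim(i, j)`, and the exhaustive loop in `history_is_consistent` plays the role of the universal quantifier over $(i, j)$. All names here are assumptions made for illustration, not the construction used in the proof.

```python
def rule(left, mid, right):
    """Hypothetical local transition rule (cellular automaton rule 110)."""
    return (110 >> ((left << 2) | (mid << 1) | right)) & 1


def honest_history(initial, steps):
    """Compute the true history row by row (cells beyond the ends read as 0)."""
    rows = [list(initial)]
    for _ in range(steps):
        prev, width = rows[-1], len(rows[-1])
        rows.append([
            rule(prev[j - 1] if j > 0 else 0,
                 prev[j],
                 prev[j + 1] if j < width - 1 else 0)
            for j in range(width)
        ])
    return rows


def cell_is_consistent(claim, i, j, width, initial):
    """Local check at (i, j): reads the claim at no more than four positions."""
    if i == 0:
        return claim(0, j) == initial[j]
    left = claim(i - 1, j - 1) if j > 0 else 0
    right = claim(i - 1, j + 1) if j < width - 1 else 0
    return claim(i, j) == rule(left, claim(i - 1, j), right)


def history_is_consistent(claim, steps, width, initial):
    """Stand-in for the universal step: every (i, j) must pass the local check."""
    return all(
        cell_is_consistent(claim, i, j, width, initial)
        for i in range(steps + 1)
        for j in range(width)
    )


# Usage sketch: an honest history passes; flipping one cell is caught locally.
INITIAL = [0, 0, 0, 1]
ROWS = honest_history(INITIAL, 3)
TAMPERED = [row[:] for row in ROWS]
TAMPERED[2][1] ^= 1
```

If every local check passes, induction on the row index shows the claimed history equals the true one, which is why checking each cell against its three predecessors suffices.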

So the idea of using a circuit-analysis algorithm to prove a circuit lower bound has merit. But ugh...
P = NP? Do we really need such a strong (probably false) algorithmic assumption? One can get away with a
slower algorithm for Circuit SAT. We say that a function $f : \mathbb{N} \to \mathbb{N}$ is
"half-exponential" if $f(f(n^k)^k) \le 2^{n^k}$ for all $k \ge 1$. Examples of half-exponential functions are
$f(n) = n^{\mathrm{poly}(\log n)}$ and $f(n) = 2^{2^{\mathrm{poly}(\log\log n)}}$. Carefully following the
above argument, one can prove:

Theorem 4.2 (Karp-Lipton [KL80], attributed to Meyer) If Circuit Satisfiability is in half-exponential
time, then $\mathrm{EXP} \not\subset \mathrm{P/poly}$.

Now how plausible is this assumption? Unfortunately, it looks quite hard to find even a $2^{n^{\delta}}$
time algorithm for Circuit SAT for some $\delta < 1$. (Note, $2^{n^{\delta}}$ is much larger than
half-exponential.) The state of the art in satisfiability algorithms is far from half-exponential time,
although steady progress has been made since Monien and Speckenmeyer [MS85]. They showed that for every $k$,
there is an $\alpha_k < 1$ such that $k$-SAT is solvable in $2^{\alpha_k n}$ time, but
$\lim_{k \to \infty} \alpha_k = 1$. Many improvements on the values of $\alpha_k$ have been found over the
years (e.g., [Sch92, Sch02, PPSZ05, MS11, Her11]), but no one has found an algorithm for 3SAT that runs
in $2^{\alpha n}$ time for every $\alpha > 0$. The Exponential Time Hypothesis of Impagliazzo and Paturi
[IP01] states that 3-SAT (and hence, Circuit SAT) requires $2^{\alpha n}$ time for some $\alpha > 0$, and a
majority of researchers believe this hypothesis.

5 Backtrack 

We have reached an impasse, so let's review how we got here. We wanted to prove that, if SUCCINCT 
3SAT can be solved in ACC, then we can design a faster-than-$2^n$ nondeterministic algorithm for SUCCINCT 
3SAT (a contradiction). We started by imagining a nondeterministic algorithm which guesses a polynomial 
size circuit for SUCCINCT 3SAT and checks correctness of that circuit, but that seemed impossible to 
efficiently implement directly. Using the ideas behind NEXP = MIP, we proposed an algorithm SatAlg 
that guesses an "oracle circuit" and verifies that, but the verification doesn't seem to be implementable 
nondeterministically. We concluded that the black-box nature of NEXP = MIP made it insufficient for our 
purposes, so we began looking for circuit-analysis problems that are easier than the corresponding black-box 
problems. Examining an argument of Karp-Lipton-Meyer, we found that a half-exponential algorithm for 
Circuit Satisfiability would imply circuit size lower bounds. But such an algorithm may not exist. Here are 
two observations: 

1. If we assume NEXP has small circuits, then all sorts of expensive computations can be captured with 
small circuits. So with a nondeterministic algorithm, we could always guess more circuits encoding 
additional information that may help verify other circuits. 

2. We have not yet used any particular properties of ACC circuits: all of our considerations would apply 
equally well for P/poly. 

Let's focus on the first point; the second will be handled later. Impagliazzo, Kabanets, and Wigderson 
[IKW02] proved that if NEXP $\subseteq$ P/poly, then not only does SUCCINCT 3SAT have polynomial size 
circuits, but in fact for every circuit succinctly representing a satisfiable 3-CNF formula, there is another 
circuit succinctly representing a satisfying assignment for that formula. 

Let $T(x)$ be the truth table of a string $x$, provided $x$ is encoded as a circuit. (If $x$ does not encode a valid 
circuit, let $T(x)$ be an all-zeroes string.) For a circuit $x$, let $F_x$ be the 3-CNF formula encoded by $T(x)$. (If $T(x)$ does not 
encode a 3-CNF, let $F_x$ be the trivially false formula.) 

Theorem 5.1 ([IKW02]) Suppose NEXP $\subseteq$ P/poly. Then for every $x \in$ SUCCINCT 3SAT, there is a circuit 
$W_x$ of poly($|x|$) size and $O(|x|)$ inputs such that $T(W_x)$ is a satisfying assignment to the formula $F_x$. 

The proof is an ingenious mixture of results on "hardness versus randomness" and good old-fashioned 
diagonalization; we do not have space to describe it here, but encourage the reader to take a look. It is not 
hard to show that if NEXP $\subseteq$ ACC, then these "satisfying assignment circuits" can be assumed to also be 
ACC: 

Corollary 5.1 Suppose NEXP $\subseteq$ ACC. Then for every $x \in$ SUCCINCT 3SAT, there is an ACC circuit $W_x$ of 
poly($|x|$) size and $O(|x|)$ inputs such that $T(W_x)$ is a satisfying assignment to the formula $F_x$. 

Proof. NEXP $\subseteq$ ACC implies NEXP $\subseteq$ P/poly, so every $x \in$ SUCCINCT 3SAT has a succinct satisfying 
assignment represented by a circuit, $W_x$. Since P $\subseteq$ ACC, it follows that the CIRCUIT VALUE PROBLEM 
has polynomial size ACC circuits.^8 Therefore from $W_x$, there is an equivalent ACC circuit $W'_x$, obtained by 
plugging in an encoding of $W_x$ into the inputs of an ACC circuit for the CIRCUIT VALUE PROBLEM. □ 

(Notice again that we still have not used specific properties of ACC in the above proof.) Hence if NEXP 
were solvable with small ACC circuits, then every problem with an extremely long solution would always 
have some solution with an extremely efficient ACC representation. 

This prompts the idea: rather than guessing a circuit for SUCCINCT 3SAT, or an oracle circuit that's 
verifiable with randomness, why not guess a circuit encoding a satisfying assignment for our given instance? 
Perhaps this is easier to check. We are immediately led to: 

^8 Recall the CIRCUIT VALUE PROBLEM is: given a circuit $C$ and input $x$, does $C(x) = 1$? 
SatAlg2($x$): 

Nondeterministically guess a poly($|x|$)-size circuit $W_x$. 

If $T(W_x)$ encodes a satisfying assignment to $F_x = T(x)$, then accept, else reject. 

Verifying that $W_x$ encodes a satisfying assignment to $F_x$ can be done in exponential time, by evaluating $W_x$ 
on all inputs, obtaining the string $T(W_x)$, evaluating $x$ on all inputs obtaining $F_x$, then checking that $T(W_x)$ 
satisfies $F_x$. Provided NEXP $\subseteq$ P/poly, SatAlg2 will correctly solve SUCCINCT 3SAT, by Theorem 5.1. 
Now the interesting question is, can SatAlg2 be implemented to run in $2^{n - \omega(\log n)}$ time, assuming NEXP $\subseteq$ 
P/poly? If yes, we will have finally contradicted Theorem 2.3, the time lower bound for SUCCINCT 3SAT. 

Checking that a variable assignment satisfies a 3-CNF formula can be done using an amount of workspace 
that is only logarithmic in the size of the formula and assignment. Hence SatAlg2 can be implemented to 
run in only polynomial space. Try all possible polynomial size circuits Wx, and for each Wx, run a logspace 
algorithm A for checking satisfiability as follows: when A needs a bit of Fx, evaluate the circuit x on the 
appropriate index; when A needs a bit of the assignment, evaluate Wx on the appropriate index. This way, 
we do not have to hold the entire formula or assignment in memory at once, and we'll take only polynomial 
space. So if we could solve this polynomial space problem faster than $2^n$, we could solve SUCCINCT 3SAT 
in less than $2^n$ time, getting a contradiction. 

This still looks algorithmically difficult to implement in faster than $2^n$ time; can the complexity of checking 
be reduced even further? To verify that an assignment satisfies a 3-CNF formula, one checks for all 
clauses that the assignment satisfies at least one of three literals in the clause. We can "iterate over all 
clauses" by feeding different inputs into the circuit $x$. We can compute the three literals of a particular 
clause of $F_x$ by evaluating $x$ at the appropriate inputs. We can compute the values of those three literals, under the 
assignment $T(W_x)$, by feeding three appropriate inputs into $W_x$. The picture of how to determine whether 
the $i$th clause of $F_x$ is satisfied by $T(W_x)$ looks like this: 



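[Figure: the circuit $D$. The input $i$ is fed into $x$, whose output (the three literals of the $i$th clause) is fed into three copies of $W_x$.] 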

In this picture, the $i$th clause of $F_x$ is $(\neg z_1 \vee z_2 \vee \neg z_3)$, and $D(i) = 1$ iff the variable assignment encoded 
by $W_x$ satisfies the $i$th clause of the formula encoded by $x$. But for every $i$, $x$ can be rigged to print the $i$th 
clause of $F_x$, which can then be checked against $W_x$. It follows that the circuit $\neg D$ is unsatisfiable if and only 
if the variable assignment encoded by $W_x$ satisfies the 3-CNF formula encoded by $x$. We have reduced the 
exponential time check of SatAlg2 to Circuit Satisfiability! Let's give a revised version of our SUCCINCT 
3SAT algorithm: 

SatAlg3($x$): 

Nondeterministically guess a poly($|x|$)-size circuit $W_x$. 
Construct the circuit $D$ made up of $x$ and $W_x$. 
Accept iff $\neg D$ is unsatisfiable. 
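
To see the shape of the check in SatAlg3, here is a minimal Python sketch. It makes several simplifying assumptions that are not in the text: circuits are modelled as plain Python functions, the formula circuit takes a clause index as an integer and returns three (variable, negated) pairs, and a brute-force loop stands in for the Circuit SAT algorithm. All names are hypothetical.

```python
def make_D(x, W):
    """Build D from the formula circuit x and the assignment circuit W.

    x(i) -> the three literals of clause i, each a pair (variable, negated)
    W(v) -> the claimed truth value (0 or 1) of variable v
    """
    def D(i):
        # A literal is true exactly when the variable's value differs
        # from its negation flag.
        return int(any(W(v) != negated for v, negated in x(i)))
    return D


def not_D_is_unsatisfiable(D, num_clauses):
    """Brute-force stand-in for the Circuit SAT call on (not D)."""
    return all(D(i) == 1 for i in range(num_clauses))


# Toy formula: (not z1 or z2 or not z3) and (z1 or z2 or z3).
CLAUSES = [((1, 1), (2, 0), (3, 1)), ((1, 0), (2, 0), (3, 0))]
x = lambda i: CLAUSES[i]
good_W = lambda v: {1: 0, 2: 1, 3: 0}[v]   # satisfies both clauses
bad_W = lambda v: 0                        # all-false violates clause 2

accept_good = not_D_is_unsatisfiable(make_D(x, good_W), len(CLAUSES))
accept_bad = not_D_is_unsatisfiable(make_D(x, bad_W), len(CLAUSES))
```

The point of the real algorithm is that the loop in `not_D_is_unsatisfiable` is replaced by a satisfiability algorithm that never enumerates all inputs.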
If Circuit SAT is solvable in $2^{n - \omega(\log n)}$ time for poly($n$)-size circuits, then SatAlg3 can be implemented 
to run in $2^{n - \omega(\log n)}$ time. But SatAlg3 solves SUCCINCT 3SAT, contradicting Theorem 2.3. We have 
established: 

Theorem 5.2 ([Wil10]) Assume Circuit SAT on circuits with $n$ inputs and poly($n$) size can be solved in 
$2^{n - \omega(\log n)}$ time. Then NEXP is not in P/poly. 

This assumption appears to be more plausible. The best known algorithms for general CNF-SAT [Sch05, 
DH08, CIP06] run faster than $2^{n - \omega(\log n)}$; there are even AC$^0$-SAT algorithms that beat $2^{n - \omega(\log n)}$ [CIP06, 
IMP11]. However it does not look easy to generalize these algorithms to unrestricted polynomial-size 
circuits. 

6 Enter ACC 

We are now ready to think about how to incorporate ACC into our arguments. We have found that faster Circuit 
SAT (an algorithmic upper bound) implies SUCCINCT 3SAT is not in P/poly (a circuit lower bound). 
Informally, this is because "SUCCINCT 3SAT has small circuits" implies that we can guess small representations 
of exponentially long information, and a faster Circuit SAT algorithm can help verify the correctness 
of the small representations. Together, the two result in a faster nondeterministic algorithm for SUCCINCT 
3SAT, contradicting a known time lower bound. 

Ideally, one would hope that this upper bound / lower bound connection can be extended to other circuit 
classes, not just P/poly. For each circuit class $\mathcal{C}$, we may define a corresponding $\mathcal{C}$-SAT problem: given 
a generic circuit from the class $\mathcal{C}$, is it satisfiable? As mentioned above, very little is known about the 
worst-case time complexity of this problem. 

If we could design a faster-than-$2^n$ algorithm for $\mathcal{C}$-SAT, that should intuitively help prove a lower bound 
against circuits from $\mathcal{C}$: we have determined a property of circuits from $\mathcal{C}$ that is quantitatively easier than 
the corresponding property for black boxes. 

6.1 Spinning restricted Circuit SAT into restricted circuit lower bounds 

What goes wrong in SatAlg3 when we assume that only ACC-SAT can be solved in less than $2^n$ time? 
Applying Corollary 5.1, if NEXP $\subseteq$ ACC then every satisfiable SUCCINCT 3SAT instance $x$ (construed as a 
circuit) has a polynomial size ACC circuit $W'_x$ such that $T(W'_x)$ is a satisfying assignment for $F_x = T(x)$, the 
formula encoded by $x$. This means we could guess an ACC circuit $W'_x$ instead of $W_x$ in SatAlg3. But what 
about the circuit $x$ itself? There is no restriction on $x$, because the definition of SUCCINCT 3SAT lets $x$ be 
arbitrary. So the resulting circuit $D$ that we produce to check $x$ and $W'_x$ will be unrestricted as well. Hence 
an ACC-SAT algorithm won't necessarily run correctly on $D$. 

However, assuming P $\subseteq$ ACC, Corollary 5.1 tells us that for every polysize circuit $x$, there exists an 
equivalent, polynomial size ACC circuit $x'$. Again, this is because the CIRCUIT VALUE PROBLEM is in 
ACC, hence we can simulate the behavior of all unrestricted circuits using ACC circuits. So we could try to 
guess this ACC circuit $x'$, and use that in place of $x$ in the construction of the circuit $D$. Then, the circuit $D$ 
will have an ACC circuit $x'$ composed with three copies of an ACC circuit $W'_x$, which will altogether be an 
ACC circuit. That is, we are proposing the following modification to SatAlg3: 
SatAlg4($x$): 

Nondeterministically guess a poly($|x|$)-size ACC circuit $W'_x$. 
Nondeterministically guess a poly($|x|$)-size ACC circuit $x'$. 
Verify that $x$ and $x'$ are equivalent (???) 
Construct the ACC circuit $D$ made up of $x'$ and $W'_x$. 
Accept iff $\neg D$ is unsatisfiable. 

SatAlg4 now checks the satisfiability of an ACC circuit, rather than an unrestricted circuit. By an 
argument analogous to what we gave for SatAlg3, a $2^{n - \omega(\log n)}$ algorithm for ACC Circuit SAT for $n$-input 
poly($n$)-size circuits would appear to give our desired nondeterministic algorithm for SUCCINCT 3SAT. 

However, as the (???) indicates, there remains a hole to be filled in. We have to verify that the input $x$ and 
our guess $x'$ are really computing the same function. Can that be done with a faster ACC-SAT algorithm? 
The usual way of checking equivalence of $x$ and $x'$ would be to set up a Circuit SAT instance of the form 

$$E(i) = (x(i) \vee x'(i)) \wedge (\neg x(i) \vee \neg x'(i)),$$

and check if $E$ is satisfiable. But this $E$ contains a copy of the unrestricted circuit $x$, so $E$ is also unrestricted! 
It seems hopeless to turn $E$ into an ACC satisfiability question. We could guess an equivalent ACC circuit 
$E'$, but that wouldn't seem to help; we'd then have to verify that $E$ is equivalent to $E'$, and $E, E'$ are only 
larger than $x, x'$. 
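
For concreteness, the equivalence check can be sketched in a few lines of Python. This is only an illustration under stated assumptions: circuits are modelled as Python functions on bit tuples, the function names are hypothetical, and exhaustive search stands in for a Circuit SAT algorithm.

```python
from itertools import product


def make_E(x, x_prime):
    """E(i) = 1 exactly when x and x_prime disagree on input i."""
    def E(i):
        a, b = x(i), x_prime(i)
        return int((a or b) and (not a or not b))
    return E


def find_disagreement(x, x_prime, n):
    """Return a satisfying input of E (a witness of inequivalence), or None."""
    E = make_E(x, x_prime)
    for i in product((0, 1), repeat=n):
        if E(i):
            return i
    return None


# Two descriptions of 3-bit majority, and one incorrect guess.
x = lambda i: (i[0] & i[1]) | (i[1] & i[2]) | (i[0] & i[2])
x_equiv = lambda i: int(sum(i) >= 2)
x_wrong = lambda i: i[0] & i[1]

no_witness = find_disagreement(x, x_equiv, 3)
witness = find_disagreement(x, x_wrong, 3)
```

Note that `make_E` calls the unrestricted circuit `x` directly, which is exactly the obstacle described above.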

This is an annoying and impossible-looking problem. The key to solving it is to use the assumption 
that NEXP has small circuits, and guess small "helper" circuits in the SUCCINCT 3SAT algorithm. If 
NEXP $\subseteq$ ACC then many types of functionality could be guessed in ACC form; if we choose the right 
functionality, our ACC circuit SAT algorithm can help verify the functionality. We must also avoid an 
infinite regress: eventually we must have some guessed circuits that can be directly checked for correctness. 

What else can we guess? We want to obtain an ACC $x'$ that's provably equivalent to the input $x$, in the 
sense that both circuits produce the same outputs. But the output of a circuit is only one bit of information. 
Why not guess an ACC circuit that captures even more information about $x$? When we construe $x$ as a circuit 
and evaluate it on input $i$, many bits of information are produced: on an input $i$, bit values are carried along 
every wire in $x$. Assuming P $\subseteq$ ACC, these bits can be produced in ACC: there are poly($n$)-size ACC circuits 
$C$ which take as input an (unrestricted) circuit $x$ described in $n$ bits, an input $i$ to $x$, and an integer $j$, such that 

$C(x, i, j)$ prints the value output by the $j$th gate of the circuit $x$, when $x$ is evaluated on input $i$. 

(Determining this value can be done in polynomial time given $(x, i, j)$, so if P $\subseteq$ ACC then there are ACC 
circuits that can determine the value.) Provided we have a circuit $C$ meeting the above specification, then 
for every $i$ we have $x(i) = C(x, i, j^*)$, where $j^*$ is the index of the output gate of $x$. By setting $x' = C(x, \cdot, j^*)$, 
we have an ACC circuit equivalent to $x$. 

Supposing we guess this circuit $C$, we have to verify it is correct on our input $x$. At this point, it appears 
we have made our job only harder, since $C$ takes strictly more inputs than our original guess $x'$ did! But by 
forcing the guessed circuit to print more valid information about $x$, we can more easily verify that all the 
information is correct.^9 For instance, if we find an AND gate $j$ in $x$ where the values output by $C(x, i, \cdot)$ 
imply that $j$ receives the inputs 1 and 0, but $C(x, i, j) = 1$, then we have detected an error in $C$. Conversely, if 
$C(x, i, \cdot)$ manages to make consistent claims about the inputs and outputs of every gate of $x$ on input $i$ (that is, 
OR gates always output the OR of their inputs, ANDs always output the AND of their inputs, NOTs always 
negate), then we know that $C$'s claim about the final output of $x(i)$ must also be correct. 

^9 There is a similar principle behind error-correcting codes: by appending a message with more information about the message, 
one can still verify the content of the original message if some bits get flipped. The principle can also be seen in the technique 
of algebrization, where in order to better manipulate a Boolean function $f(x_1, \ldots, x_n)$, one "lifts" $f$ to a low-degree multivariate 
polynomial $p$ which is equivalent to $f$ on the set $\{0,1\}^n$. Querying $p$ on points outside of the set $\{0,1\}^n$, one can often gain 
considerable advantages in verifying and manipulating $f$. 

We can think of $C$ as encoding a satisfying assignment to an exponentially large constraint satisfaction 
problem: for every input $i$ to the circuit $x$ and every gate $j$ of $x$, the $(i,j)$ constraint is that $C$'s claimed 
inputs to gate $j$ in the evaluation of $x(i)$ are consistent with $C$'s claimed output of gate $j$. This constraint 
satisfaction problem has a succinct description - namely, the circuit $x$ itself. Every circuit $x$ with $s$ gates can 
be represented as a set of tuples 

$$S_x = \{ (j, j_1, j_2, g) \mid j = 1, \ldots, s;\ j_1, j_2 < j;\ g \in \{\mathrm{AND}, \mathrm{OR}, \mathrm{NOT}, \mathrm{INPUT}\} \}.$$

The tuple $(j, j_1, j_2, g)$ says that the $j$th gate of $x$ takes its inputs from the output of gate $j_1$, the output of gate 
$j_2$, and $j$ has gate type $g$.^10 (For $j = 1, \ldots, n$ we use the convention that gate $j$ corresponds to the $j$th bit of 
input, so the integers $j_1$ and $j_2$ equal 0, and $g$ = INPUT.) From the set $S_x$, we can define a function $G_x$ which 
takes $j$ as input and prints the rest of the tuple $(j_1, j_2, g)$ from $S_x$. Since the number of all possible inputs 
to $G_x$ is only $s$ (the number of gates in $x$), $G_x$ can be implemented in Boolean logic with an $O(\log |x|)$-size 
collection of CNF formulas, each with $O(\log |x|)$ variables and $O(|x|)$ clauses. 

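So to check that a guessed ACC circuit $C$ is correct, we can use the following ACC circuit: 

[Figure: the circuit $E(i,j)$. The index $j$ is fed into $G_x$ to obtain $(j_1, j_2, g)$; the bits $C(x,i,j_1)$, $C(x,i,j_2)$, and $C(x,i,j)$ are then fed, together with $g$, into a small circuit $t$.] 

The $O(1)$-size circuit $t$ takes bits $b_1, b_2$ (from $j_1, j_2$), a bit $b$ (from $j$), and a gate type $g$; $t(b_1, b_2, b, g) = 1$ 
iff $g(b_1, b_2) = b$. For example, $t(1, 0, 1, \mathrm{OR}) = 1$, since $\mathrm{OR}(1, 0) = 1$. 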

For a given $i$, $E(i,j) = 1$ for all $j$ iff $C$ outputs the correct value of every wire in $x(i)$. Now define 

$$E'(i) = \bigwedge_{j=1}^{n} \left[ C(x,i,j) \Leftrightarrow i_j \right] \wedge \bigwedge_{j=n+1}^{s} E(i,j),$$

where $i_j$ is the $j$th bit of the input $i$. (The first group of ANDs check that $C$ represents the input gates correctly.) The circuit $E'$ is also ACC, has 
exactly the same number of inputs as the circuit $x$, and $\neg E'$ is unsatisfiable iff $C$ is correct on all inputs $i$ to 
$x$. Our modified algorithm now looks like: 

^10 Without loss of generality, we may assume every gate of the circuit has at most two inputs. 
SatAlg5($x$): 

Nondeterministically guess a poly($|x|$)-size ACC circuit $W'_x$. 

Nondeterministically guess a poly($|x|$)-size ACC circuit $C$. 

Construct the collection of CNFs representing $G_x$. 

Construct the ACC circuit $\neg E'$ made up of $C$ and $G_x$. 

If $\neg E'$ is satisfiable then reject. 

/* at this point, $C$ must be correct */ 

Define $x' = C(x, \cdot, j^*)$ where $j^*$ corresponds to the output gate of $x$. 
Construct the ACC circuit $D$ made up of $x'$ and $W'_x$. 
Accept iff $\neg D$ is unsatisfiable. 
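
The gate-by-gate consistency check at the heart of SatAlg5 can be illustrated with a small Python sketch. Everything here is an assumption made for illustration: the circuit is a dict of tuples (playing the role of $G_x$), the guessed circuit $C$ is modelled as a Python function `claim(i, j)`, NOT gates repeat their single input as both $j_1$ and $j_2$, and the names are hypothetical.

```python
from itertools import product

GATE_FUNCS = {
    "AND": lambda a, b: a & b,
    "OR": lambda a, b: a | b,
    "NOT": lambda a, b: 1 - a,   # second argument is ignored
}


def t(b1, b2, b, g):
    """Local test: does gate type g map inputs (b1, b2) to output b?"""
    return int(GATE_FUNCS[g](b1, b2) == b)


def E_prime(S, n, claim, i):
    """1 iff claim is consistent on every input wire and every gate for input i.

    S maps each gate index j (n < j <= s) to (j1, j2, g); wires 1..n are inputs.
    """
    inputs_ok = all(claim(i, j) == i[j - 1] for j in range(1, n + 1))
    gates_ok = all(
        t(claim(i, j1), claim(i, j2), claim(i, j), g)
        for j, (j1, j2, g) in S.items()
    )
    return int(inputs_ok and gates_ok)


def honest_claim(S, n):
    """A correct wire-value oracle, obtained by actually evaluating the circuit."""
    def claim(i, j):
        if j <= n:
            return i[j - 1]
        j1, j2, g = S[j]
        return GATE_FUNCS[g](claim(i, j1), claim(i, j2))
    return claim


# Toy circuit on 3 inputs: gate 6 = (in1 AND in2) OR (NOT in3).
S = {4: (1, 2, "AND"), 5: (3, 3, "NOT"), 6: (4, 5, "OR")}
good = honest_claim(S, 3)
bad = lambda i, j: 0 if j == 6 else good(i, j)   # lies about the output gate

ALL_INPUTS = list(product((0, 1), repeat=3))
good_passes = all(E_prime(S, 3, good, i) for i in ALL_INPUTS)
bad_caught = any(E_prime(S, 3, bad, i) == 0 for i in ALL_INPUTS)
```

Each individual test inspects only a constant number of claimed wire values, which is why the whole check can be packaged as a single satisfiability question about $\neg E'$.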



We have finally reached a correct algorithm for SUCCINCT 3SAT (under the assumption that NEXP $\subseteq$ 
ACC) with the property that if ACC satisfiability can be solved faster than $2^n$, then SatAlg5 can be implemented 
to run faster than $2^n$, yielding our desired lower bound. 

Theorem 6.1 ([Wil11]) If ACC Circuit SAT can be solved on circuits with $n$ inputs and $n^k$ size in $2^{n - \omega(\log n)}$ 
time for every $k$, then NEXP is not contained in ACC. 

We only needed a few basic closure properties of ACC in the above argument: if you take a polynomially 
large AND of different ACC circuits, then the result is still an ACC circuit; given two ACC circuit families 
$\{C_n\}$ and $\{D_n\}$, if you define a circuit family $\{E_n\}$ by the rule 

$$E_n(x_1, \ldots, x_n) = C_{n^k}(D_n(x_1, \ldots, x_n), \ldots, D_n(x_1, \ldots, x_n))$$

for some fixed $k$, this "composition" of $\{C_n\}$ and $\{D_n\}$ is also an ACC circuit family. Most well-studied 
circuit classes satisfy these composition properties, so the above considerations apply to them as well: faster 
circuit satisfiability for a restricted class $\mathcal{C}$ entails lower bounds for solving problems in $\mathcal{C}$. Intuitively, 
the difficulty faced by researchers who design fast algorithms for verification of certain kinds of circuits 
is related to the difficulty of proving that certain problems can't be efficiently solved with these kinds of 
circuits. 

6.2 ACC Circuit Satisfiability 

It remains to prove that ACC circuit satisfiability really does have a faster algorithm. We can discover this 
algorithm by studying a known decomposition result for ACC circuits, from work initiated by Yao [Yao90], 
continued by Beigel and Tarui [BT94], Allender and Gore [AG94], and Green et al. [GKRST95]. The 
decomposition result says that every ACC circuit family can be expressed as a family of functions 

$$\{ g_n(h_n(x_1, \ldots, x_n)) \},$$

where $h_n$ is a "sparse" multilinear polynomial, and $g_n$ is a "sparse" lookup table. 

Lemma 6.1 ([Yao90, BT94, AG94]) There is an algorithm and function $f : \mathbb{N} \times \mathbb{N} \to \mathbb{N}$ such that given 
an ACC circuit $C$ with MOD$_m$ gates, of $n$ inputs, depth $d$, and size $s$, the algorithm outputs a function 
$g : \{0, \ldots, K\} \to \{0,1\}$ and a multilinear polynomial $h(x_1, \ldots, x_n)$ with $K$ monomials, such that $C = g \circ h$, 
where $K = 2^{O(\log^{f(d,m)} s)}$. The algorithm takes at most $O(K)$ time. 
Call this transformation the polynomial decomposition for ACC. The function $f(d,m)$ is estimated to 
be no more than $m^{O(d)}$. The high-level idea behind the decomposition is to first convert every OR and 
AND gate in the ACC circuit to low-degree polynomials (i.e., low fan-in ANDs of MOD$_2$ gates) using 
randomness, "push" these low fan-in ANDs down to the bottom of the circuit, derandomize the construction 
using pairwise independence and a MAJORITY gate at the top, then use more sophisticated polynomial 
tricks to "push" the remaining layers of MOD gates into the top gate, which remains a symmetric function 
throughout the transformation. At the end, what remains is a symmetric function of a quasipolynomial 
number of ANDs, which can be represented by a $g$ of an $h$ in the above manner. Of course this is a very rough 
description; the reader should check the references for more details. 

What does this polynomial decomposition algorithm say about solving satisfiability for ACC? Razborov 
and Smolensky's lower bounds on AC$^0[p]$ can be seen as "approximations by polynomials" - they show that 
small AC$^0[p]$ circuits can be approximated on many points by low-degree polynomials, so limitations on 
representing functions with low-degree polynomials can be ported over to limitations on AC$^0[p]$. Lemma 6.1 
allows the polynomial $h$ to output a number of possible values; those values are then filtered down to a single 
bit by another function $g$. While we may not approximate an ACC circuit very well with a polynomial, we 
can still simulate a great deal of the computation in an ACC circuit with a polynomial: after the evaluation 
of the polynomial $h$, we are only a $g$-evaluation away from the ACC circuit's output. (In fact, Green et 
al. [GKRST95] prove that $g$ can be made a specific, simple function: the "middle bit" function.) 

Polynomials are nice, but what good do they serve for satisfiability algorithms? The short answer is: the 
Fast Fourier Transform. Less ambiguously, if we are given a multilinear polynomial in its coefficient representation 
(we are told the coefficients of the $2^n$ possible monomials), then we can determine that polynomial's 
value on all points in $\{0,1\}^n$, in only $O(2^n \cdot \mathrm{poly}(n))$ time. That is, from the coefficient representation 
of the polynomial we can quickly compute the point representation. This is very nice; we are spending only 
poly($n$) time per evaluation point, even though our original polynomial could have been arbitrary - it could 
have $2^n$ different coefficients! 

There are several ways to derive a COEFFICIENT-TO-POINT algorithm. Perhaps the most natural one is 
a recursive strategy. We are given a multilinear polynomial $p(x_1, \ldots, x_n)$, and wish to compute a table $T$ of 
$2^n$ entries such that 

$$T = [p(0,\ldots,0,0),\ p(0,\ldots,0,1),\ \ldots,\ p(1,\ldots,1,0),\ p(1,\ldots,1,1)].$$

If $n = 1$, we can return $T = [p(0), p(1)]$ in unit time. When $n > 1$, because $p$ is multilinear we can write it 
as 

$$p(x_1, \ldots, x_n) = x_1 \cdot q_1(x_2, \ldots, x_n) + q_2(x_2, \ldots, x_n).$$

That is, we can split $p$ into sums of monomials which include $x_1$, and sums of monomials which do not 
include $x_1$. Recursively calling our algorithm on $q_1$ and $q_2$, we receive two tables $T_1$ and $T_2$ of $2^{n-1}$ numbers 
each. Notice that $p(0, x_2, \ldots, x_n) = q_2(x_2, \ldots, x_n)$, and $p(1, x_2, \ldots, x_n) = q_1(x_2, \ldots, x_n) + q_2(x_2, \ldots, x_n)$. 
Therefore the corresponding $2^n$ size table for $p$ is 

$$T = [T_2[1], \ldots, T_2[2^{n-1}],\ T_1[1] + T_2[1], \ldots, T_1[2^{n-1}] + T_2[2^{n-1}]].$$

The merging of tables can be done in $O(2^n \cdot \mathrm{poly}(n))$ time, so the running time recurrence is 

$$R(2^n) = 2 \cdot R(2^{n-1}) + O(2^n \cdot \mathrm{poly}(n)),$$

which solves to $O(2^n \cdot \mathrm{poly}(n))$.^11 

^11 Here we are assuming that the sizes of coefficients in the polynomial $p$ are negligible, so the bit-complexity of arithmetic does 
not play a significant role in the running time. This assumption is valid for the polynomials we are considering. 
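
To make the recursion concrete, here is a minimal Python sketch. The representation (a dict from frozensets of variable indices to integer coefficients) and all names are chosen here purely for illustration; the base case is taken at zero variables rather than one, which amounts to the same thing.

```python
def coefficient_to_point(coeffs, variables):
    """Evaluate a multilinear polynomial on every 0/1 point.

    coeffs:    dict mapping frozenset of variable indices -> coefficient
    variables: list of variable indices, first variable most significant
    Returns a list of 2**len(variables) values in lexicographic order.
    """
    if not variables:
        # No variables left: the polynomial is just its constant term.
        return [coeffs.get(frozenset(), 0)]

    x1, rest = variables[0], variables[1:]
    # p = x1 * q1 + q2: split monomials by whether they contain x1.
    q1 = {m - {x1}: c for m, c in coeffs.items() if x1 in m}
    q2 = {m: c for m, c in coeffs.items() if x1 not in m}

    T1 = coefficient_to_point(q1, rest)
    T2 = coefficient_to_point(q2, rest)

    # x1 = 0 gives q2; x1 = 1 gives q1 + q2.
    return T2 + [a + b for a, b in zip(T1, T2)]


# p(x1, x2) = 2 + 3*x1 - x1*x2
P = {frozenset(): 2, frozenset({1}): 3, frozenset({1, 2}): -1}
TABLE = coefficient_to_point(P, [1, 2])   # [p(0,0), p(0,1), p(1,0), p(1,1)]
```

For this toy polynomial, `TABLE` comes out as `[2, 2, 5, 4]`.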
How can a COEFFICIENT-TO-POINT algorithm lead to a faster ACC SAT algorithm? There seem to be 
two sticking points. 

1. The above works directly on a multilinear polynomial $h$, but we need an algorithm that works for a $g$ 
of an $h$. 

2. The above runs in $2^n$ time, but we need a SAT algorithm that runs faster than $2^n$. 

Addressing the first point is straightforward. We can evaluate $h$ on all $2^n$ points, and after we have 
produced the $2^n$ size table, we can determine all the distinct numbers in the table and check if some number 
makes $g$ output 1. Since $h$ is a "sparse" polynomial, the total number of different numbers in $T$ is "sparse", 
so this can be done in no more than $O(2^n \cdot \mathrm{poly}(n))$ time. 
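
Here is a minimal Python sketch of this step. Note that it swaps in an iterative, in-place variant of the table computation (one pass per variable over a bitmask-indexed array) rather than the recursive formulation; the bitmask encoding, the toy $g$, and all function names are assumptions made for illustration.

```python
def evaluate_all_points(coeffs, n):
    """coeffs[mask] is the coefficient of the monomial whose variables are
    the set bits of mask. Returns T with T[a] = h(a), where bit j of a is
    the value assigned to variable j."""
    T = list(coeffs)
    for j in range(n):
        bit = 1 << j
        for a in range(1 << n):
            if a & bit:
                # Setting variable j to 1 adds in every monomial containing it.
                T[a] += T[a ^ bit]
    return T


def g_of_h_satisfiable(g, coeffs, n):
    """Is there a 0/1 point a with g(h(a)) == 1?"""
    distinct_values = set(evaluate_all_points(coeffs, n))
    return any(g(v) == 1 for v in distinct_values)


# h(x0, x1, x2) = x0 + x1 + x2, so h(a) is the number of ones in a.
H = [0, 1, 1, 0, 1, 0, 0, 0]
exactly_two = lambda v: int(v == 2)
exactly_four = lambda v: int(v == 4)

sat = g_of_h_satisfiable(exactly_two, H, 3)
unsat = g_of_h_satisfiable(exactly_four, H, 3)
```

Only the distinct values of $h$ are passed to $g$, mirroring the observation that the table contains few different numbers.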

The second point looks more difficult to overcome. To apply COEFFICIENT-TO-POINT and get less-than-$2^n$ 
time, we need to work with a polynomial that has fewer than $n$ variables. This would seem to 
require that our original ACC circuit has fewer than $n$ inputs - something we are not willing to concede. 

There is a trick to circumvent this problem, and it exploits two observations. First, note the circuit 
satisfiability problem amounts to asking if the OR of some $2^n$ circuits (with no free variables) evaluates to 
1. Second, if we take an OR of many copies of an ACC circuit of depth $d$, the result is an ACC circuit of 
depth $d+1$, because ACC circuits allow for OR gates of unbounded fan-in. 

Suppose we take a subset of $k$ of the $n$ inputs to an ACC circuit $C$, evaluate $C$ on all $2^k$ possible values 
of this subset, then take the OR of these $2^k$ circuit copies induced by the different evaluations. The resulting 
circuit $C'$ has the properties: 

• $C'$ has only $n - k$ free inputs. 

• If $C$ had size $s$, then $C'$ has size $O(2^k \cdot s)$. 

• $C'$ is still an ACC circuit (but with one more level of depth). 

• $C'$ is satisfiable iff $C$ is satisfiable. 

Call this transformation the $k$-blowup of the circuit $C$. Basically, we have "brute-forced" the SAT problem 
for $C$ on a $k$-subset of the inputs to $C$. This blows up the size, but it decreases the number of input variables, 
something we are interested in doing, but with polynomials. However, because $C'$ is an ACC circuit, we can 
still perform a polynomial decomposition on the circuit $C'$, then work with the underlying $(n-k)$-variate 
polynomial. 
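
The $k$-blowup is easy to state in code. The following Python sketch is illustrative only: circuits are modelled as functions on bit tuples, the first $k$ inputs are the ones that get brute-forced (any $k$-subset would do), and the names are hypothetical.

```python
from itertools import product


def k_blowup(C, k):
    """Return C' on the remaining n-k inputs: the OR of the 2**k copies of C
    obtained by fixing the first k inputs in every possible way."""
    def C_prime(rest):
        return int(any(C(prefix + tuple(rest))
                       for prefix in product((0, 1), repeat=k)))
    return C_prime


def brute_force_sat(C, n):
    """Exhaustive satisfiability check, used here only to compare C and C'."""
    return any(C(a) for a in product((0, 1), repeat=n))


# A 4-input circuit with the single satisfying input (1, 0, 1, 1).
C = lambda a: int(a == (1, 0, 1, 1))
C_prime = k_blowup(C, 2)          # now a circuit on 2 inputs

same_answer = brute_force_sat(C, 4) == brute_force_sat(C_prime, 2)
```

Searching over `C_prime` visits only $2^{n-k}$ points, at the cost of each evaluation being $2^k$ times more expensive; the gain comes only once the polynomial decomposition is applied to $C'$.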

Now we are ready to stitch together the ACC satisfiability algorithm, which is given a circuit $C$ with $n$ 
inputs and $s$ size. 

ACCSat($C$): 

Let $k = n^{1/(2 f(d,m))}$. 

Compute $C'$, the $k$-blowup of $C$, which has size at most $2^k \cdot s$. 

Decompose $C'$ into $g \circ h$, 

where $h$ has $n - k$ variables and $K = 2^{O(k^{f(d,m)} + \log^{f(d,m)} s)}$ monomials. 
Evaluate $h$ on all $2^{n-k}$ points in $O(2^{n-k} \cdot \mathrm{poly}(n) + K)$ time. 
Output satisfiable iff $g \circ h$ equals 1 on at least one point. 

When $s \le 2^{n^{o(1)}}$, we have $K = 2^{n^{1/2 + o(1)}}$ and ACCSat runs in about $2^{n - n^{1/(2 f(d,m))}}$ time. 






Theorem 6.2 ([Wil11]) ACC Circuit Satisfiability for subexponential size circuits can be computed in $2^{n - n^{\varepsilon}}$ 
time, for some $\varepsilon > 0$ which depends on the depth and modulo gates of the input circuit. 

Combining this with Theorem 6.1, we conclude that SUCCINCT 3SAT is not in ACC. 

7 Further Directions 

There are two obvious directions to continue in: 

• Find an easier problem that is not in ACC. It is possible the ideas here may be extended to find 
an EXP problem (not just NEXP) which isn't in ACC. More precisely, faster $\mathcal{C}$-SAT for a circuit 
class $\mathcal{C}$ ought to lead to EXP $\not\subset \mathcal{C}$. Here is my extremely hand-wavy argument for this. Intuitively, 
a faster $\mathcal{C}$-SAT algorithm reveals a weakness in representing computations with $\mathcal{C}$ circuits. The class 
$\mathcal{C}$ is not like a set of black boxes: these circuits cannot hide a satisfying input so easily. Moreover, a 
faster SAT algorithm for $\mathcal{C}$ highlights a strength of algorithms that run in less-than-$2^n$ time: they can 
solve nontrivial satisfiability problems on circuits from $\mathcal{C}$. That is, my intuition is that a faster $\mathcal{C}$-SAT 
algorithm shows "less-than-$2^n$ algorithms are strong" and "$\mathcal{C}$-circuits are weak" - so perhaps $2^n$ time 
can be separated from $\mathcal{C}$-circuits using satisfiability algorithms alone. 

• Prove stronger circuit lower bounds for SUCCINCT 3SAT. In order to separate NEXP from a circuit 
class $\mathcal{C}$, we need only design faster satisfiability algorithms for $\mathcal{C}$-circuits. In fact, it suffices to find 
a faster algorithm for the problem: given a circuit $C \in \mathcal{C}$ where you are promised that either $C$ is 
unsatisfiable or $C$ accepts 1/2 of its inputs, determine which is the case. Hence we only need to 
derandomize certain promise problems to establish the lower bounds. So far I have personally found 
it convenient to think about satisfiability directly, but eventually we will probably find the promise 
problem to be an easier chore. 

There are several other not-so-obvious directions. One possibility is to prove almost-everywhere ACC 
circuit lower bounds. Right now we can only say that ACC circuits can't solve SUCCINCT 3SAT on infinitely 
many input lengths. But we don't believe that some input lengths are inherently easier than others, so our 
lower bound ought to be extendable to all but finitely many input lengths. 

Another angle is to try proving new separations of uniform complexity classes. Can we prove NP is not 
equal to uniform ACC, where a single efficient algorithm given input $0^n$ can construct the $n$th circuit in the 
family? Can some of the ideas here be used to finally prove NEXP $\neq$ BPP? 

Finally, much of our analysis centered around specific properties of SUCCINCT 3SAT. Might it be the 
case that other "Succinct" problems are useful for lower bounds, too? 

Acknowledgments. I thank Anup Rao for an inspiring discussion, and Virginia for her patience while I 
was finishing this article (we were supposed to be touring Barcelona, not circuit complexity). 